With the advancement of the internet, individuals are becoming more reliant on online applications
to meet most of their needs, yet they have very little spare time to devote to selection and
decision-making. As a result, the need for recommender systems is expanding. Recommender systems
provide consumers with individualized recommendations on a variety of goods, simplifying their
decisions and saving time. Although a lot of research has been conducted in various fields, work in
the agriculture sector remains insufficient. The issue is especially pressing for farmers in
Quang Nam-Da Nang, and in Vietnam as a whole, because of the severe climate. Stemming from that,
this research designs a recommender system for farmers based on tree data structures. The proposed
model combines the you only look once (YOLO) algorithm in a convolutional neural network (CNN)
model with a similarity tree for computing similarity. Through experiments on 400 samples,
evaluated by precision, accuracy, and positive predictive value (PPV), the research shows that the
proposed model is feasible and gains better results compared with other state-of-the-art models.

1. INTRODUCTION

We live in an age where the internet, intranets, and e-commerce platforms are widespread, and an
abundance of information is available that we rarely use. People are increasingly turning to web
applications to meet their needs since they provide more possibilities for selecting a certain
product [1]. However, people may find it challenging to choose a suitable item that meets their
needs from a huge pool of options. As a result, online applications need to provide strong
recommender systems that can assist users. Recommender systems aid in product selection by
identifying user interests and preferences [2]-[5]. E-commerce, health, agriculture, banking,
social media, education, and sports all benefit from recommender systems. The agricultural sector
plays an important role in the economies of developing countries. Although the sector has many
advantages, it is plagued by issues such as climate change, rainy seasons that are not as regular
as they used to be, droughts, floods, and farmers leaving their homes for better-paying jobs in
cities.

Hybrid recommender systems combine collaborative and content-based approaches, and numerous studies
report that they often perform better than either approach alone [6]-[8]. Knowledge-based
recommender systems need no rating data to produce suggestions; instead, they derive a list of
suggestions from a knowledge base. Both case-based and constraint-based systems are
knowledge-based, and they differ in how the knowledge is incorporated. Demographic recommender
systems provide suggestions based on user demographic information, matching users whose age,
gender, nationality, and location are comparable [9]. The general scheme of recommender systems is
shown in Figure 1.


[Figure: taxonomy of recommender systems (RS). Recommender system: content-based RS, collaborative
RS, knowledge-based RS, hybrid RS. Collaborative RS: memory-based RS, model-based RS. Memory-based
RS: user-based, item-based.]

Figure 1. Types of recommender systems


Revolution 4.0 is a tightly combined set of physical and digital hyper-connected systems with a
focus on the internet, the internet of things (IoT), and artificial intelligence (AI). It creates
entirely new production possibilities and has a profound impact on economic, political, and social
life around the world. This fourth industrial revolution has four major features, one of which is
AI and cybernetics, which allows people to control systems remotely without regard for space or
time constraints and to interact in a faster and more accurate manner [10]-[12]. An overview of the
types of recommender systems in use today can be found in Figure 1.

The rest of this paper is organized as follows: related works are listed in section 2. The problem
definition, system architecture, and experimental method are described in section 3. The authors
discuss the results in section 4 and provide a brief conclusion in section 5.

Recommender systems assist users in selecting products based on their particular preferences.
Machine learning approaches using tree-based methods and fuzzy logic, as well as other recommender
systems for the agricultural industry, are reviewed in the literature [5], [8], [9]. Agro
consultant is a comprehensive intelligent system developed to help farmers in Vietnam and other
Asian countries make intelligent planting selections that depend on weather and environmental
aspects such as soil quality, farm location, and sowing season. Distributed computing is used to
assess customer behavioral data, uncover customer preferences, and provide personalized
suggestions. Another technique predicts the crop yield and the price a farmer may obtain from a
property by applying a sliding-window non-linear regression approach to rainfall, weather, market
prices, land acreage, and the yield of previous crops. With a collaborative recommender system
that employs fuzzy logic, farmers can obtain an early prediction of crop production [13], [14].

AI in general, and computer vision in particular, is one of the key technologies of the fourth
industrial revolution that scientists are particularly interested in researching. Some of the most
well-known applications of computer vision are: utilizing large data sets accumulated over time to
train recognition models; using machine learning and deep learning techniques to assist in object
detection and recognition; and developing image classification systems [15]-[17]. The proposals and
projects listed above demonstrate that, in the current agricultural industry, scientists and
researchers are very interested in researching and applying AI and computer vision, especially
algorithms such as CNN that extract image features and attribute descriptors of fruits [4].


2. SYSTEM ARCHITECTURE 

Agriculture is critical to the economies of developing countries, as it provides the key foundation
of food, profits, and employment for rural populations. Agriculture in Vietnam has progressed
dramatically during the last few decades, and it supports more than 70% of rural households.
Quang Nam, one of Vietnam's provinces, has an agricultural sector that produces over 30% of the
country's national summary data page (NSDP) and employs 73% of the country's agricultural labour.
However, most farmers in Quang Nam continue to cultivate in the old manner, unaware of the new
herbicides, fertilizers, and instruments available for a particular crop. As a result, they may
suffer losses in their farming, which has a significant impact on society. In this study, the
authors develop a recommender system that suggests pesticides and crop-specific equipment for a
farmer who grows a certain crop. The system also suggests viable seeds for other crops depending on
soil test findings [4], [10].

This system, which is based on computer vision technology, is designed to determine the ripening
stage of pineapple fruit. The system analyzes data and extracts image features and attribute
descriptors of pineapples using a CNN algorithm. In this article, the authors use the YOLOv4 model
to improve training speed and performance. The system's architecture is depicted in [18]-[21].

The system's input is a dataset of images collected by the authors at various pineapple gardens.
After an image has been fed into the system using its absolute path, it is labeled and the
coordinates of its features are determined. As output, the user receives the input image together
with information about the growth stage.


2.1. Input data 

In this paper, the authors detect the pineapple's growth stages, from the bud emergence stage to
the senescence stage. The data were collected during the research process: the authors surveyed and
took pictures at several pineapple gardens. The images are divided into two sets, a training
dataset and a validation dataset, and each photo has labels that assign a location to the
pineapple. The datasets record attributes such as image name, growth stage, and the pineapple
position in the image (x_min, y_min, x_max, y_max). The whole dataset includes 36,000 photos
covering five growth stages of pineapple, taken at various local pineapple gardens. The model is
trained on 30,000 independently labeled images and evaluated on a testing set of the remaining
6,000 images.
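
The annotation record and the train/test split described above can be sketched as follows. This is
only an illustration under stated assumptions: the class name Annotation, the function
split_dataset, the integer encoding of the five growth stages, and the seeded random shuffle are
hypothetical and are not taken from the paper, which does not say how the 6,000 test images were
chosen.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One labeled pineapple, using the attributes listed in section 2.1."""
    image_name: str
    growth_stage: int  # assumed encoding: 0..4 for the five growth stages
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def split_dataset(records, n_train, seed=0):
    """Shuffle reproducibly, then cut into a training part and a testing part."""
    if not 0 <= n_train <= len(records):
        raise ValueError("n_train must be between 0 and len(records)")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # the seed is an assumption
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    # Scaled-down stand-in for the 36,000 photos (30,000 train / 6,000 test).
    demo = [Annotation(f"img_{i}.jpg", i % 5, 0, 0, 10, 10) for i in range(36)]
    train, test = split_dataset(demo, n_train=30)
    print(len(train), len(test))
```

With the full dataset, the same call would use n_train=30000, leaving 6,000 records for testing.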


2.2. Annotate images 

Following the collection of input images, the authors label each image in order to extract the
characteristic regions of the pineapple. Labeling is done with the LabelImg software, an image
annotation tool written in Python that uses Qt for its interface. The information is saved as text
files (.txt). Each file is named after its image and saved in you only look once (YOLO) format in
the specified folder. It contains parameters such as class code, x-coordinate, y-coordinate,
x'-coordinate, y'-coordinate.
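
As an illustration of the label files, the sketch below converts a corner-style box (x_min, y_min,
x_max, y_max), as recorded in section 2.1, into one line of a YOLO text file. It assumes the usual
YOLO text layout written by LabelImg, in which each line holds the class code followed by the box
center x, center y, width, and height, all normalized by the image size. The function name
to_yolo_line and the six-decimal formatting are hypothetical choices, not the authors' code.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Return 'class x_center y_center width height' with values scaled to [0, 1]."""
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image size must be positive")
    if x_max < x_min or y_max < y_min:
        raise ValueError("box corners are in the wrong order")
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 200x200 pixel box in a 400x500 image, growth-stage class 2 (made-up values).
    print(to_yolo_line(2, 100, 50, 300, 250, 400, 500))
```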


2.3. Pineapple’s growth stage detection model 
The model is built in the following steps:

a. Data preprocessing: Annotate images. In order for the detector to learn to detect objects in
images, it needs to be fed with labeled training data. After the labeling is finished, the files
are saved in the 'data' folder for training.

b. Training: Images are still being collected and labeled automatically in order to improve
training coverage and scale via reinforcement. Image resolution is increased for greater accuracy.
The higher the pixel resolution, the more detailed the features the model can detect and the more
easily image borders, image styles, and colors can be extracted. The patterns are determined by the
outer skin of the pineapple and its defects. The image features are extracted using the CNN
algorithm in the YOLOv4 training model [22].

c. Testing: The system is tested and evaluated after the training phase. The system takes input
images from the testing set, compares them with the extracted characteristic attributes, and then
returns prediction results that include the input images and their labels for different pineapple
cultivars. Figure 2 shows that the training set includes images of semi-ripe pineapples and ripe
pineapples.

Figure 2. Pineapple images used for detection


3. METHOD 

The system takes as input a list of farmers on farms in the Quang Nam-Da Nang region, Vietnam, and
recommends a list of plant tools and pesticides. Depending on the area and the growing soil, the
system recommends that farmers experiment with pineapple, and it also recommends pineapple
varieties suitable for the area [4], [10]. The method has two parts:

i) This study employed a questionnaire method. The questionnaire was first distributed to the
respondents. A total of 400 people took part in the research: 310 males and 90 females with an
average age of 25.4 years (Mage=30 years, Fage=32.5 years). All of the respondents are from
Da Nang, Vietnam, in Quang Nam's west area. The proposed method uses the user's location and crop
preferences as input to construct a skewed binary tree. Because it is used to compute similarity,
this skewed tree is referred to as a similarity tree. The proposed model is depicted in Figure 3.

ii) A similarity tree is used to locate the database farmers whose characteristics are comparable
to those of the active query farmer. Figure 3 depicts the topology of the similarity tree and its
probable nodes. The root of the similarity tree is the farmer's country. States (ST1, ST2, ST3,
..., STn) are the nodes immediately below the country, and places (PL1, PL2, ..., PLm) are the
nodes immediately below a state. Seeds (S1, S2, ..., Sp) are the next nodes, and pesticides (P1)
and instruments (I1) are those used in the treatment of the seeds.


[Figure: similarity tree. Root: country of the farmer. Second level: states ST1, ST2, ST3, ST4,
ST5, ..., STn. Third level: places PL1, PL2, ..., PLm.]

Figure 3. Structure of the similarity tree


Using the aforementioned structure, a similarity tree is generated for every active query farmer
given as input, and its preorder traversal is computed. The preorder traversals of the database
users' trees are computed in the same way. Comparing the active query farmer's preorder traversal
with the database users' preorder traversals yields the comparable users: database users whose
preorder traversals match the active user's are considered similar. A database user's similarity
value is set to 1 if that user's preorder traversal matches that of the active query farmer
[19]-[21]. For example, if an active query farmer specifies rice as his preferred crop and is from
Vietnam, the state of Da Nang, and the province of Quang Nam, a similarity tree is generated for
that user.
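
The comparison of preorder traversals can be sketched as follows. This is a minimal illustration
and not the authors' implementation: the names Node, build_similarity_tree, preorder, and
similarity are hypothetical, and the tree is assumed to be right-skewed, with each level (country,
state, place, crop) stored as the only child of the level above it.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Node:
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_similarity_tree(path: Sequence[str]) -> Optional[Node]:
    """Chain the labels into a right-skewed binary tree (assumed layout)."""
    root = None
    for label in reversed(path):
        root = Node(label, right=root)
    return root


def preorder(node: Optional[Node]) -> List[str]:
    """Visit the node first, then its left subtree, then its right subtree."""
    if node is None:
        return []
    return [node.label] + preorder(node.left) + preorder(node.right)


def similarity(query_path: Sequence[str], user_path: Sequence[str]) -> int:
    """Return 1 if both preorder traversals match, otherwise 0."""
    query = preorder(build_similarity_tree(query_path))
    user = preorder(build_similarity_tree(user_path))
    return 1 if query == user else 0


if __name__ == "__main__":
    query = ["Vietnam", "Da Nang", "Quang Nam", "rice"]
    print(similarity(query, ["Vietnam", "Da Nang", "Quang Nam", "rice"]))
    print(similarity(query, ["Vietnam", "Da Nang", "Quang Nam", "pineapple"]))
```

Because the tree is skewed, its preorder traversal is simply the chain of labels from the root
down, so each comparison takes time linear in the number of levels.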

Figure 3 shows the similarity tree structure for the pineapple-growing areas of Quang Nam-Da Nang,
and Figure 4 shows the similarity tree for the current query user. A skewed binary search tree is
used to build the similarity tree. The preorder traversal of the query user's similarity tree is
found and compared with the preorder traversal of each database user's tree. The system sets the
similarity value to either 1 or 0, depending on whether the two preorder traversals match. That
is, database users with a similarity value of 1 are regarded as similar to the query user [23],
[24]. These users are then ordered by the rating values they have provided for previously bought
seeds, and the top N users are used to generate the recommendations.
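
One possible reading of this ranking step is sketched below. The paper does not give the exact
aggregation rule, so several details are assumptions made for illustration only: the function name
recommend_top_n, the dictionary layout of a database user, and ranking seeds by their mean rating
among the users whose similarity value is 1.

```python
from collections import defaultdict


def recommend_top_n(query_traversal, users, n):
    """Rank seeds rated by similar users (matching traversal) and keep the top n."""
    ratings_by_seed = defaultdict(list)
    for user in users:
        if tuple(user["traversal"]) == tuple(query_traversal):  # similarity value 1
            for seed, rating in user["ratings"].items():
                ratings_by_seed[seed].append(rating)

    def mean(seed):
        values = ratings_by_seed[seed]
        return sum(values) / len(values)

    # Highest mean rating first; ties are broken alphabetically for determinism.
    ranked = sorted(ratings_by_seed, key=lambda seed: (-mean(seed), seed))
    return ranked[:n]


if __name__ == "__main__":
    query = ("Vietnam", "Da Nang", "Hoa Vang", "pineapple")
    database = [  # made-up users with ratings on the paper's 1-5 scale
        {"traversal": query, "ratings": {"seed_a": 5, "seed_b": 3}},
        {"traversal": query, "ratings": {"seed_b": 4, "seed_c": 2}},
        {"traversal": ("Vietnam", "Quang Nam", "Bai Loc", "pineapple"),
         "ratings": {"seed_c": 5}},
    ]
    print(recommend_top_n(query, database, n=2))
```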


4. RESULTS AND DISCUSSION 
4.1. Performance assessment 

The suggested system's performance is assessed using measures such as precision, accuracy, and
positive predictive value (PPV). Precision is the percentage of correctly classified positive
instances among those that are predicted to be positive. Positive predictive value is another name
for precision. Accuracy is the proportion of correct classifications out of the total number of
cases.


$$\text{Precision} = \frac{TP}{TP + FP} \tag{1}$$


where TP is true positive, FP is false positive, TN is true negative, and FN is false negative. A
good rule of thumb for identifying true positives is to consider the proportion of positives that
have been detected correctly. The similarity tree of a pineapple-growing farmer is similar to the
tree pictured in Figure 4.


[Figure: similarity tree with a root node and two branches. First branch: DANANG, HOA VANG,
VƯỜN THƠM. Second branch: QUANG NAM, BAI LOC, VƯỜN THƠM.]

Figure 4. Active query farmer similarity tree


A false negative (FN) occurs when a condition is reported as absent although it is present. A
false positive (FP) occurs when a condition is reported as present although it is absent. A true
negative (TN) indicates that the condition is reported as absent when it is indeed absent. The
positive predictive value (PPV) is the proportion of positive test results that are true
positives, and it shows the performance of the diagnostic statistical measure. If the system is
very accurate, then the PPV will be high too. Obtaining a false positive is a type I error, and
the false discovery rate (FDR) expresses the likelihood of this happening. The PPV and FDR are
connected by (2) [20], [22], [25].


$$\text{PPV} + \text{FDR} = 1 \tag{2}$$
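
The measures used in this section can be computed directly from the four confusion-matrix counts.
The sketch below uses the standard definitions (precision and PPV as TP/(TP+FP), FDR as
FP/(TP+FP), accuracy as the share of correct classifications). The function name
classification_metrics and the example counts are hypothetical and do not come from the paper's
experiments.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute precision (PPV), FDR, and accuracy from confusion-matrix counts."""
    predicted_positive = tp + fp
    total = tp + fp + tn + fn
    if predicted_positive == 0 or total == 0:
        raise ValueError("need at least one predicted positive and one sample")
    ppv = tp / predicted_positive   # precision and PPV are the same quantity
    fdr = fp / predicted_positive   # share of predicted positives that are wrong
    accuracy = (tp + tn) / total
    return {"precision": ppv, "ppv": ppv, "fdr": fdr, "accuracy": accuracy}


if __name__ == "__main__":
    # Made-up counts for illustration only.
    metrics = classification_metrics(tp=30, fp=10, tn=50, fn=10)
    print(metrics)
    # PPV and FDR always sum to 1 under these definitions.
    print(metrics["ppv"] + metrics["fdr"])
```

Under these definitions each measure lies between 0 and 1.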


To determine whether the suggested method is suitable, 20 crop kinds are tested. The database
contains 150 user opinions, in which the seeds for different crops are rated on a 1-5 scale. For
the five most often selected items, the authors computed the precision, accuracy, PPV, and FDR
values for five distinct users; the values are listed in Table 1. Figure 5 shows the PPV and FDR
graph, while Figure 6 shows the precision and accuracy graph.


Table 1. Precision, accuracy, PPV and FDR values of five users

User ID   Precision   Accuracy   PPV    FDR
1         0.57        0.62       0.75   0.55
2         0.34        0.75       0.5    0.56
3         1.21        0.9        1.25   0.6
4         0.78        0.74       0.9    0.73

[Figure: two graphs plotted against User ID. Left panel: PPV and FDR. Right panel: precision and
accuracy.]

Figure 5. Graph for PPV and FDR
Figure 6. Graph for precision and accuracy


4.2. Time complexity analysis 

Traditional approaches such as cosine similarity, the Pearson correlation coefficient, and
Jaccard similarity can be shown to be less time-efficient than the suggested technique. The time
required to find the cosine similarity between two vectors is O(n^2), while recommendation with a
skewed binary tree is O(n), where n is the size of the input. After training completed
successfully, the study used the trained model to perform detection on the collected images. The
study used the data from the 6,000th iteration to make the comparison. The training result is shown
in the diagram in Figure 5.


$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$


From the evaluation diagram, the authors found that the trained model achieved results good enough
to perform pineapple growth stage detection. The trained model correctly identified the pineapples
and detected their growth stages with an accuracy of more than 95%.

This study used deep learning techniques, specifically a neural network model, to identify the
pineapple and determine its ripening stage in order to propose optimal care and harvesting
solutions [24], [25]. As a result, farmers can save time, money, and human resources, while also
increasing system flexibility. In this article, the authors used the YOLOv4 model. The
experimental results show that the YOLOv4 model achieved high accuracy in determining pineapple
ripeness, reaching approximately 95% after 6,000 epochs.


5. CONCLUSION AND FUTURE WORK 

A recommendation system predicts a user's preferences based on previous choices. This technology
has proven successful across virtually every domain. In countries like Vietnam, where agriculture
makes up a large part of the economy, effective recommender systems are essential for farmers to
adapt to environmental change and to learn about new pesticides and instruments for boosting crop
yields. This paper argues that the right choice of insecticides and equipment for a particular crop
should start from the farmer's needs. The proposed system also suggests other crops that may be
suitable for the farmer's area. A tree-based similarity strategy is used to find users or farmers
related to a query user. Compared to other similarity metrics, this method is more time-efficient.
Accuracy, precision, and positive predictive values were used to assess the effectiveness of the
suggested system. In this study, only Quang Nam and Da Nang were taken into account as geographical
locations. Future studies may examine technology adaptation for agricultural production and other
geographical regions. Deep neural networks, a machine learning technology that is most suitable for
large datasets, can also improve the capability of agricultural recommender systems. Stacking
auto-encoders is an effective method to retrain deep neural networks when the dataset is small.
When it comes to generalization, deep neural networks outperform shallow neural networks and
support vector machines (SVM). Finally, the authors plan to study a hybrid recommender system in
future research. Such a system is designed to use various input data and to combine the robust
features of both collaborative filtering and content-based filtering in a unified system.